37 research outputs found

    Identification of polymorphic inversions from genotypes

    Get PDF
    Background: Polymorphic inversions are a source of genetic variability with a direct impact on recombination frequencies. Given the difficulty of their experimental study, computational methods have been developed to infer their existence in a large number of individuals using genome-wide data of nucleotide variation. Methods based on haplotype tagging of known inversions attempt to classify individuals as having a normal or inverted allele. Other methods that measure differences between linkage disequilibrium attempt to identify regions with inversions but unable to classify subjects accurately, an essential requirement for association studies. Results: We present a novel method to both identify polymorphic inversions from genome-wide genotype data and classify individuals as containing a normal or inverted allele. Our method, a generalization of a published method for haplotype data [1], utilizes linkage between groups of SNPs to partition a set of individuals into normal and inverted subpopulations. We employ a sliding window scan to identify regions likely to have an inversion, and accumulation of evidence from neighboring SNPs is used to accurately determine the inversion status of each subject. Further, our approach detects inversions directly from genotype data, thus increasing its usability to current genome-wide association studies (GWAS). Conclusions: We demonstrate the accuracy of our method to detect inversions and classify individuals on principled-simulated genotypes, produced by the evolution of an inversion event within a coalescent model [2]. We applied our method to real genotype data from HapMap Phase III to characterize the inversion status of two known inversions within the regions 17q21 and 8p23 across 1184 individuals. Finally, we scan the full genomes of the European Origin (CEU) and Yoruba (YRI) HapMap samples. We find population-based evidence for 9 out of 15 well-established autosomic inversions, and for 52 regions previously predicted by independent experimental methods in ten (9+1) individuals [3,4]. We provide efficient implementations of both genotype and haplotype methods as a unified R package inveRsion

    On the power and the systematic biases of the detection of chromosomal inversions by paired-end genome sequencing

    Get PDF
    One of the most used techniques to study structural variation at a genome level is paired-end mapping (PEM). PEM has the advantage of being able to detect balanced events, such as inversions and translocations. However, inversions are still quite difficult to predict reliably, especially from high-throughput sequencing data. We simulated realistic PEM experiments with different combinations of read and library fragment lengths, including sequencing errors and meaningful base-qualities, to quantify and track down the origin of false positives and negatives along sequencing, mapping, and downstream analysis. We show that PEM is very appropriate to detect a wide range of inversions, even with low coverage data. However, % of inversions located between segmental duplications are expected to go undetected by the most common sequencing strategies. In general, longer DNA libraries improve the detectability of inversions far better than increments of the coverage depth or the read length. Finally, we review the performance of three algorithms to detect inversions -SVDetect, GRIAL, and VariationHunter-, identify common pitfalls, and reveal important differences in their breakpoint precisions. These results stress the importance of the sequencing strategy for the detection of structural variants, especially inversions, and offer guidelines for the design of future genome sequencing projects

    Algebraic Distribution of Segmental Duplication Lengths in Whole-Genome Sequence Self-Alignments

    Get PDF
    Distributions of duplicated sequences from genome self-alignment are characterized, including forward and backward alignments in bacteria and eukaryotes. A Markovian process without auto-correlation should generate an exponential distribution expected from local effects of point mutation and selection on localised function; however, the observed distributions show substantial deviation from exponential form – they are roughly algebraic instead – suggesting a novel kind of long-distance correlation that must be non-local in origin

    Global, regional, and national comparative risk assessment of 79 behavioural, environmental and occupational, and metabolic risks or clusters of risks, 1990-2015: a systematic analysis for the Global Burden of Disease Study 2015

    Get PDF
    SummaryBackground The Global Burden of Diseases, Injuries, and Risk Factors Study 2015 provides an up-to-date synthesis of the evidence for risk factor exposure and the attributable burden of disease. By providing national and subnational assessments spanning the past 25 years, this study can inform debates on the importance of addressing risks in context. Methods We used the comparative risk assessment framework developed for previous iterations of the Global Burden of Disease Study to estimate attributable deaths, disability-adjusted life-years (DALYs), and trends in exposure by age group, sex, year, and geography for 79 behavioural, environmental and occupational, and metabolic risks or clusters of risks from 1990 to 2015. This study included 388 risk-outcome pairs that met World Cancer Research Fund-defined criteria for convincing or probable evidence. We extracted relative risk and exposure estimates from randomised controlled trials, cohorts, pooled cohorts, household surveys, census data, satellite data, and other sources. We used statistical models to pool data, adjust for bias, and incorporate covariates. We developed a metric that allows comparisons of exposure across risk factors—the summary exposure value. Using the counterfactual scenario of theoretical minimum risk level, we estimated the portion of deaths and DALYs that could be attributed to a given risk. We decomposed trends in attributable burden into contributions from population growth, population age structure, risk exposure, and risk-deleted cause-specific DALY rates. We characterised risk exposure in relation to a Socio-demographic Index (SDI). Findings Between 1990 and 2015, global exposure to unsafe sanitation, household air pollution, childhood underweight, childhood stunting, and smoking each decreased by more than 25%. Global exposure for several occupational risks, high body-mass index (BMI), and drug use increased by more than 25% over the same period. All risks jointly evaluated in 2015 accounted for 57·8% (95% CI 56·6–58·8) of global deaths and 41·2% (39·8–42·8) of DALYs. In 2015, the ten largest contributors to global DALYs among Level 3 risks were high systolic blood pressure (211·8 million [192·7 million to 231·1 million] global DALYs), smoking (148·6 million [134·2 million to 163·1 million]), high fasting plasma glucose (143·1 million [125·1 million to 163·5 million]), high BMI (120·1 million [83·8 million to 158·4 million]), childhood undernutrition (113·3 million [103·9 million to 123·4 million]), ambient particulate matter (103·1 million [90·8 million to 115·1 million]), high total cholesterol (88·7 million [74·6 million to 105·7 million]), household air pollution (85·6 million [66·7 million to 106·1 million]), alcohol use (85·0 million [77·2 million to 93·0 million]), and diets high in sodium (83·0 million [49·3 million to 127·5 million]). From 1990 to 2015, attributable DALYs declined for micronutrient deficiencies, childhood undernutrition, unsafe sanitation and water, and household air pollution; reductions in risk-deleted DALY rates rather than reductions in exposure drove these declines. Rising exposure contributed to notable increases in attributable DALYs from high BMI, high fasting plasma glucose, occupational carcinogens, and drug use. Environmental risks and childhood undernutrition declined steadily with SDI; low physical activity, high BMI, and high fasting plasma glucose increased with SDI. In 119 countries, metabolic risks, such as high BMI and fasting plasma glucose, contributed the most attributable DALYs in 2015. Regionally, smoking still ranked among the leading five risk factors for attributable DALYs in 109 countries; childhood underweight and unsafe sex remained primary drivers of early death and disability in much of sub-Saharan Africa. Interpretation Declines in some key environmental risks have contributed to declines in critical infectious diseases. Some risks appear to be invariant to SDI. Increasing risks, including high BMI, high fasting plasma glucose, drug use, and some occupational exposures, contribute to rising burden from some conditions, but also provide opportunities for intervention. Some highly preventable risks, such as smoking, remain major causes of attributable DALYs, even as exposure is declining. Public policy makers need to pay attention to the risks that are increasingly major contributors to global burden. Funding Bill & Melinda Gates Foundation

    Population genetic analysis of bi-allelic structural variants from low-coverage sequence data with an expectation-maximization algorithm

    Get PDF
    Background Population genetics and association studies usually rely on a set of known variable sites that are then genotyped in subsequent samples, because it is easier to genotype than to discover the variation. This is also true for structural variation detected from sequence data. However, the genotypes at known variable sites can only be inferred with uncertainty from low coverage data. Thus, statistical approaches that infer genotype likelihoods, test hypotheses, and estimate population parameters without requiring accurate genotypes are becoming popular. Unfortunately, the current implementations of these methods are intended to analyse only single nucleotide and short indel variation, and they usually assume that the two alleles in a heterozygous individual are sampled with equal probability. This is generally false for structural variants detected with paired ends or split reads. Therefore, the population genetics of structural variants cannot be studied, unless a painstaking and potentially biased genotyping is performed first. Results We present svgem, an expectation-maximization implementation to estimate allele and genotype frequencies, calculate genotype posterior probabilities, and test for Hardy-Weinberg equilibrium and for population differences, from the numbers of times the alleles are observed in each individual. Although applicable to single nucleotide variation, it aims at bi-allelic structural variation of any type, observed by either split reads or paired ends, with arbitrarily high allele sampling bias. We test svgem with simulated and real data from the 1000 Genomes Project. Conclusions svgem makes it possible to use low-coverage sequencing data to study the population distribution of structural variants without having to know their genotypes. Furthermore, this advance allows the combined analysis of structural and nucleotide variation within the same genotype-free statistical framework, thus preventing biases introduced by genotype imputation

    [SWI+], the Prion Formed by the Chromatin Remodeling Factor Swi1, Is Highly Sensitive to Alterations in Hsp70 Chaperone System Activity

    Get PDF
    The yeast prion [SWI+], formed of heritable amyloid aggregates of the Swi1 protein, results in a partial loss of function of the SWI/SNF chromatin-remodeling complex, required for the regulation of a diverse set of genes. Our genetic analysis revealed that [SWI+] propagation is highly dependent upon the action of members of the Hsp70 molecular chaperone system, specifically the Hsp70 Ssa, two of its J-protein co-chaperones, Sis1 and Ydj1, and the nucleotide exchange factors of the Hsp110 family (Sse1/2). Notably, while all yeast prions tested thus far require Sis1, [SWI+] is the only one known to require the activity of Ydj1, the most abundant J-protein in yeast. The C-terminal region of Ydj1, which contains the client protein interaction domain, is required for [SWI+] propagation. However, Ydj1 is not unique in this regard, as another, closely related J-protein, Apj1, can substitute for it when expressed at a level approaching that of Ydj1. While dependent upon Ydj1 and Sis1 for propagation, [SWI+] is also highly sensitive to overexpression of both J-proteins. However, this increased prion-loss requires only the highly conserved 70 amino acid J-domain, which serves to stimulate the ATPase activity of Hsp70 and thus to stabilize its interaction with client protein. Overexpression of the J-domain from Sis1, Ydj1, or Apj1 is sufficient to destabilize [SWI+]. In addition, [SWI+] is lost upon overexpression of Sse nucleotide exchange factors, which act to destabilize Hsp70's interaction with client proteins. Given the plethora of genes affected by the activity of the SWI/SNF chromatin-remodeling complex, it is possible that this sensitivity of [SWI+] to the activity of Hsp70 chaperone machinery may serve a regulatory role, keeping this prion in an easily-lost, meta-stable state. Such sensitivity may provide a means to reach an optimal balance of phenotypic diversity within a cell population to better adapt to stressful environments

    Mortality from gastrointestinal congenital anomalies at 264 hospitals in 74 low-income, middle-income, and high-income countries: a multicentre, international, prospective cohort study

    Get PDF
    Background: Congenital anomalies are the fifth leading cause of mortality in children younger than 5 years globally. Many gastrointestinal congenital anomalies are fatal without timely access to neonatal surgical care, but few studies have been done on these conditions in low-income and middle-income countries (LMICs). We compared outcomes of the seven most common gastrointestinal congenital anomalies in low-income, middle-income, and high-income countries globally, and identified factors associated with mortality. // Methods: We did a multicentre, international prospective cohort study of patients younger than 16 years, presenting to hospital for the first time with oesophageal atresia, congenital diaphragmatic hernia, intestinal atresia, gastroschisis, exomphalos, anorectal malformation, and Hirschsprung's disease. Recruitment was of consecutive patients for a minimum of 1 month between October, 2018, and April, 2019. We collected data on patient demographics, clinical status, interventions, and outcomes using the REDCap platform. Patients were followed up for 30 days after primary intervention, or 30 days after admission if they did not receive an intervention. The primary outcome was all-cause, in-hospital mortality for all conditions combined and each condition individually, stratified by country income status. We did a complete case analysis. // Findings: We included 3849 patients with 3975 study conditions (560 with oesophageal atresia, 448 with congenital diaphragmatic hernia, 681 with intestinal atresia, 453 with gastroschisis, 325 with exomphalos, 991 with anorectal malformation, and 517 with Hirschsprung's disease) from 264 hospitals (89 in high-income countries, 166 in middle-income countries, and nine in low-income countries) in 74 countries. Of the 3849 patients, 2231 (58·0%) were male. Median gestational age at birth was 38 weeks (IQR 36–39) and median bodyweight at presentation was 2·8 kg (2·3–3·3). Mortality among all patients was 37 (39·8%) of 93 in low-income countries, 583 (20·4%) of 2860 in middle-income countries, and 50 (5·6%) of 896 in high-income countries (p<0·0001 between all country income groups). Gastroschisis had the greatest difference in mortality between country income strata (nine [90·0%] of ten in low-income countries, 97 [31·9%] of 304 in middle-income countries, and two [1·4%] of 139 in high-income countries; p≤0·0001 between all country income groups). Factors significantly associated with higher mortality for all patients combined included country income status (low-income vs high-income countries, risk ratio 2·78 [95% CI 1·88–4·11], p<0·0001; middle-income vs high-income countries, 2·11 [1·59–2·79], p<0·0001), sepsis at presentation (1·20 [1·04–1·40], p=0·016), higher American Society of Anesthesiologists (ASA) score at primary intervention (ASA 4–5 vs ASA 1–2, 1·82 [1·40–2·35], p<0·0001; ASA 3 vs ASA 1–2, 1·58, [1·30–1·92], p<0·0001]), surgical safety checklist not used (1·39 [1·02–1·90], p=0·035), and ventilation or parenteral nutrition unavailable when needed (ventilation 1·96, [1·41–2·71], p=0·0001; parenteral nutrition 1·35, [1·05–1·74], p=0·018). Administration of parenteral nutrition (0·61, [0·47–0·79], p=0·0002) and use of a peripherally inserted central catheter (0·65 [0·50–0·86], p=0·0024) or percutaneous central line (0·69 [0·48–1·00], p=0·049) were associated with lower mortality. // Interpretation: Unacceptable differences in mortality exist for gastrointestinal congenital anomalies between low-income, middle-income, and high-income countries. Improving access to quality neonatal surgical care in LMICs will be vital to achieve Sustainable Development Goal 3.2 of ending preventable deaths in neonates and children younger than 5 years by 2030
    corecore